Skip to content

Conversation

@carlydf
Copy link
Contributor

@carlydf carlydf commented Nov 14, 2025

What changed?

Improve versioning error messages, return NotFound gRPC error for all not found errors

Why?

So that users can see if they mispelled a deployment name or build id, and generally know which versions are involved when there is an error

How did you test it?

  • built
  • run locally and tested manually
  • covered by existing tests
  • added new unit test(s)
  • added new functional test(s)

Potential risks

Changing the error returned only affected replay if the type of error changes, not the string. I am keeping the type the same in the few places where I had to change the error string, so it won't affect workflow replay.

Error messages are not compared when checking order of commands in workflow replay check in Go SDK here: https://github.com/temporalio/sdk-go/blob/ad990a3bfe17560c82cf84c80dfdf6c41eb43580/internal/internal_task_handlers.go#L1616


Note

Standardizes NotFound errors and makes worker deployment/version errors more informative (include deployment/build IDs), with corresponding workflow validations and test updates.

  • Worker Deployment (server):
    • Error Semantics: Use NotFound for missing deployments/versions (e.g., ErrWorkerDeploymentNotFound, ErrWorkerDeploymentVersionNotFound), replacing prior FailedPrecondition in those cases.
    • Richer Messages: Parameterize errors with deployment/version info (build_id, deployment name); add formatted messages for draining/has-pollers/current-or-ramping; introduce suffix constants and string matching to map workflow failures to precise non-retryable errors.
    • APIs Affected: SetCurrentVersion, SetRampingVersion, DescribeVersion, DeleteVersionFromWorkerDeployment, and update handling (handleUpdateVersionFailures).
    • Validation Updates (workflow): Return FailedPreconditionf with version details for missing task queues, manager identity mismatch, and current/ramping delete checks.
  • Tests:
    • Adjust expectations to new NotFound usage and formatted messages; update assertions to use FailedPreconditionf message strings.

Written by Cursor Bugbot for commit 5bb0b22. This will update automatically on new commits. Configure here.

@carlydf carlydf requested review from a team as code owners November 14, 2025 01:14
@carlydf carlydf changed the base branch from main to shahab/sync-apis November 14, 2025 01:14
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Ramping crashes on empty version configuration

In SetRampingVersion, when version is empty (unsetting ramping) and allowNoPollers is true, versionObj is nil. Calling versionObj.GetBuildId() at line 800 in updateWithStartWorkerDeployment causes a nil pointer dereference. The code should check if versionObj is nil before dereferencing or handle empty version specially.

service/worker/workerdeployment/client.go#L792-L803

deploymentName,
versionObj.GetBuildId(),
&updatepb.Request{
Input: &updatepb.Input{Name: SetRampingVersion, Args: updatePayload},
Meta: &updatepb.Meta{UpdateId: updateID, Identity: identity},
},
identity,
requestID,
d.getSyncBatchSize(),
)
if err != nil {
return nil, err

Fix in Cursor Fix in Web


Base automatically changed from shahab/sync-apis to main November 14, 2025 15:20
@carlydf carlydf merged commit 93e43e5 into main Nov 25, 2025
59 checks passed
@carlydf carlydf deleted the versioning-error-handling-tweaks branch November 25, 2025 01:17
Shivs11 added a commit that referenced this pull request Nov 26, 2025
… not found errors (#8641)

## What changed?
Improve versioning error messages, return NotFound gRPC error for all
not found errors

## Why?
So that users can see if they mispelled a deployment name or build id,
and generally know which versions are involved when there is an error

## How did you test it?
- [ ] built
- [ ] run locally and tested manually
- [x] covered by existing tests
- [ ] added new unit test(s)
- [ ] added new functional test(s)

## Potential risks
Changing the error returned only affected replay if the type of error
changes, not the string. I am keeping the type the same in the few
places where I had to change the error string, so it won't affect
workflow replay.

Error messages are not compared when checking order of commands in
workflow replay check in Go SDK here:
https://github.com/temporalio/sdk-go/blob/ad990a3bfe17560c82cf84c80dfdf6c41eb43580/internal/internal_task_handlers.go#L1616


<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Standardizes NotFound errors and makes worker deployment/version
errors more informative (include deployment/build IDs), with
corresponding workflow validations and test updates.
> 
> - **Worker Deployment (server)**:
> - **Error Semantics**: Use `NotFound` for missing deployments/versions
(e.g., `ErrWorkerDeploymentNotFound`,
`ErrWorkerDeploymentVersionNotFound`), replacing prior
`FailedPrecondition` in those cases.
> - **Richer Messages**: Parameterize errors with deployment/version
info (`build_id`, deployment name); add formatted messages for
draining/has-pollers/current-or-ramping; introduce suffix constants and
string matching to map workflow failures to precise non-retryable
errors.
> - **APIs Affected**: `SetCurrentVersion`, `SetRampingVersion`,
`DescribeVersion`, `DeleteVersionFromWorkerDeployment`, and update
handling (`handleUpdateVersionFailures`).
> - **Validation Updates (workflow)**: Return `FailedPreconditionf` with
version details for missing task queues, manager identity mismatch, and
current/ramping delete checks.
> - **Tests**:
> - Adjust expectations to new `NotFound` usage and formatted messages;
update assertions to use `FailedPreconditionf` message strings.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
5bb0b22. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Shivam Saraf <[email protected]>
Co-authored-by: ShahabT <[email protected]>
lina-temporal pushed a commit that referenced this pull request Nov 26, 2025
… not found errors (#8641)

## What changed?
Improve versioning error messages, return NotFound gRPC error for all
not found errors

## Why?
So that users can see if they mispelled a deployment name or build id,
and generally know which versions are involved when there is an error

## How did you test it?
- [ ] built
- [ ] run locally and tested manually
- [x] covered by existing tests
- [ ] added new unit test(s)
- [ ] added new functional test(s)

## Potential risks
Changing the error returned only affected replay if the type of error
changes, not the string. I am keeping the type the same in the few
places where I had to change the error string, so it won't affect
workflow replay.

Error messages are not compared when checking order of commands in
workflow replay check in Go SDK here:
https://github.com/temporalio/sdk-go/blob/ad990a3bfe17560c82cf84c80dfdf6c41eb43580/internal/internal_task_handlers.go#L1616


<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Standardizes NotFound errors and makes worker deployment/version
errors more informative (include deployment/build IDs), with
corresponding workflow validations and test updates.
> 
> - **Worker Deployment (server)**:
> - **Error Semantics**: Use `NotFound` for missing deployments/versions
(e.g., `ErrWorkerDeploymentNotFound`,
`ErrWorkerDeploymentVersionNotFound`), replacing prior
`FailedPrecondition` in those cases.
> - **Richer Messages**: Parameterize errors with deployment/version
info (`build_id`, deployment name); add formatted messages for
draining/has-pollers/current-or-ramping; introduce suffix constants and
string matching to map workflow failures to precise non-retryable
errors.
> - **APIs Affected**: `SetCurrentVersion`, `SetRampingVersion`,
`DescribeVersion`, `DeleteVersionFromWorkerDeployment`, and update
handling (`handleUpdateVersionFailures`).
> - **Validation Updates (workflow)**: Return `FailedPreconditionf` with
version details for missing task queues, manager identity mismatch, and
current/ramping delete checks.
> - **Tests**:
> - Adjust expectations to new `NotFound` usage and formatted messages;
update assertions to use `FailedPreconditionf` message strings.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
5bb0b22. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Shivam Saraf <[email protected]>
Co-authored-by: ShahabT <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants